AITopics | automatic classification

automatic classification

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Automatic classification of stop realisation with wav2vec2.0

Tanner, James, Sonderegger, Morgan, Stuart-Smith, Jane, Mielke, Jeff, Kendall, Tyler

arXiv.org Artificial Intelligence, Jun-2-2025

Modern phonetic research regularly makes use of automatic tools for the annotation of speech data; however, few tools exist for the annotation of many variable phonetic phenomena. At the same time, pre-trained self-supervised models, such as wav2vec2.0, have been shown to perform well at speech classification tasks and latently encode fine-grained phonetic information. We demonstrate that wav2vec2.0 models can be trained to automatically classify stop burst presence with high accuracy in both English and Japanese, robust across both finely-curated and unprepared speech corpora. Patterns of variability in stop realisation are replicated with the automatic annotations, and closely follow those of manual annotations. These results demonstrate the potential of pre-trained speech models as tools for the automatic annotation and processing of speech corpus data, enabling researchers to 'scale-up' the scope of phonetic research with relative ease.

annotation, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2505.23688

Country:

North America > United States (0.28)
Europe > Austria (0.28)
North America > Canada > Quebec > Montreal (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.40)

Text classification using machine learning methods

Oancea, Bogdan

arXiv.org Artificial Intelligence, Feb-27-2025

In this paper we present the results of an experiment that used machine learning methods to obtain models for the automatic classification of products. To apply automatic classification methods, we transformed the product names from a text representation to numeric vectors, a process called word embedding. We used several embedding methods: Count Vectorization, TF-IDF, Word2Vec, FASTTEXT, and GloVe. Having the product names in the form of numeric vectors, we proceeded with a set of machine learning methods for automatic classification: Logistic Regression, Multinomial Naive Bayes, kNN, Artificial Neural Networks, Support Vector Machines, and Decision trees with several variants. The results show an impressive accuracy of the classification process for Support Vector Machines, Logistic Regression, and Random Forests. Regarding the word embedding methods, the best results were obtained with the FASTTEXT technique.
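The pipeline this abstract describes (product names turned into numeric vectors, then passed to a classifier) can be illustrated with a minimal sketch. It is not the author's implementation: it assumes a whitespace tokenizer, a smoothed TF-IDF weighting, and a 1-nearest-neighbour rule with cosine similarity as the simplest stand-in for kNN. The product names, labels, and all function names are hypothetical and exist only for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def fit_idf(documents):
    """Compute a smoothed inverse document frequency for each training token."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(tokenize(doc)))  # count each token once per document
    return {tok: math.log((1 + n_docs) / (1 + df)) + 1.0
            for tok, df in doc_freq.items()}


def tfidf_vector(text, idf):
    """Map a text to a sparse TF-IDF vector; tokens unseen in training are ignored."""
    counts = Counter(tok for tok in tokenize(text) if tok in idf)
    return {tok: count * idf[tok] for tok, count in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(tok, 0.0) for tok, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict(text, train_vectors, train_labels, idf):
    """Return the label of the most similar training example (1-nearest neighbour)."""
    query = tfidf_vector(text, idf)
    sims = [cosine(query, vec) for vec in train_vectors]
    best = max(range(len(sims)), key=sims.__getitem__)
    return train_labels[best]


# Hypothetical toy data, not taken from the paper.
TRAIN_NAMES = [
    "whole milk 1l",
    "skimmed milk 500ml",
    "cheddar cheese block",
    "cotton t-shirt blue",
    "denim jeans slim fit",
    "wool sweater grey",
]
TRAIN_LABELS = ["dairy", "dairy", "dairy", "clothing", "clothing", "clothing"]

IDF = fit_idf(TRAIN_NAMES)
TRAIN_VECTORS = [tfidf_vector(name, IDF) for name in TRAIN_NAMES]

if __name__ == "__main__":
    for name in ["organic milk 1l", "blue denim jeans"]:
        print(name, "->", predict(name, TRAIN_VECTORS, TRAIN_LABELS, IDF))
```

Swapping the nearest-neighbour rule for one of the other classifiers named in the abstract would leave the vectorization step unchanged, which is why the two stages are kept in separate functions here.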

classification, product name, representation, (15 more...)

arXiv.org Artificial Intelligence

2502.19801

Country:

North America > United States > New York > New York County > New York City (0.05)
Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.05)
Asia > India (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)

Leveraging AI for Automatic Classification of PCOS Using Ultrasound Imaging

Divekar, Atharva, Sonawane, Atharva

arXiv.org Artificial Intelligence, Dec-30-2024

The AUTO-PCOS Classification Challenge seeks to advance the diagnostic capabilities of artificial intelligence (AI) in identifying Polycystic Ovary Syndrome (PCOS) through automated classification of healthy and unhealthy ultrasound frames. This report outlines our methodology for building a robust AI pipeline utilizing transfer learning with the InceptionV3 architecture to achieve high accuracy in binary classification. Preprocessing steps ensured the dataset was optimized for training, validation, and testing, while interpretability methods like LIME and saliency maps provided valuable insights into the model's decision-making. Our approach achieved an accuracy of 90.52%, with precision, recall, and F1-score metrics exceeding 90% on validation data, demonstrating its efficacy. The project underscores the transformative potential of AI in healthcare, particularly in addressing diagnostic challenges like PCOS. Key findings, challenges, and recommendations for future enhancements are discussed, highlighting the pathway for creating reliable, interpretable, and scalable AI-driven medical diagnostic tools.
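For readers unfamiliar with the metrics reported above, the following sketch shows how accuracy, precision, recall, and F1-score are conventionally computed for a binary classifier. The label vectors are hypothetical toy values, not the challenge data, and the function name `binary_metrics` is an assumption made for this illustration.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = unhealthy frame, 0 = healthy frame.
Y_TRUE = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
Y_PRED = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

if __name__ == "__main__":
    print(binary_metrics(Y_TRUE, Y_PRED))
```

Accuracy alone can look favourable on imbalanced data, which is one reason reports of this kind usually quote precision, recall, and F1 alongside it.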

machine learning, natural language, text classification, (19 more...)

arXiv.org Artificial Intelligence

2501.01984

Country: Asia > India > Maharashtra > Pune (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Diagnostic Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.70)

Automatic Classification of General Movements in Newborns

Chopard, Daphné, Laguna, Sonia, Chin-Cheong, Kieran, Dietz, Annika, Badura, Anna, Wellmann, Sven, Vogt, Julia E.

arXiv.org Artificial Intelligence, Nov-19-2024

General movements (GMs) are spontaneous, coordinated body movements in infants that offer valuable insights into the developing nervous system. Assessed through the Prechtl GM Assessment (GMA), GMs are reliable predictors for neurodevelopmental disorders. However, GMA requires specifically trained clinicians, who are limited in number. To scale up newborn screening, there is a need for an algorithm that can automatically classify GMs from infant video recordings. This data poses challenges, including variability in recording length, device type, and setting, with each video coarsely annotated for overall movement quality. In this work, we introduce a tool for extracting features from these recordings and explore various machine learning techniques for automated GM classification.

automatic classification, general movement, keypoint, (15 more...)

arXiv.org Artificial Intelligence

2411.09821

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Germany > Bavaria > Regensburg (0.06)
Asia > India > Karnataka > Bengaluru (0.04)

Genre: Research Report > New Finding (0.94)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Automatic Classification of News Subjects in Broadcast News: Application to a Gender Bias Representation Analysis

Pelloin, Valentin, Dodson, Lena, Chapuis, Émile, Hervé, Nicolas, Doukhan, David

arXiv.org Artificial Intelligence, Jul-19-2024

This paper introduces a computational framework designed to delineate gender distribution biases in topics covered by French TV and radio news. We transcribe a dataset of 11.7k hours, broadcast in 2023 on 21 French channels. A Large Language Model (LLM) is used in few-shot conversation mode to obtain a topic classification on those transcriptions. Using the generated LLM annotations, we explore the finetuning of a specialized smaller classification model, to reduce the computational cost. To evaluate the performance of these models, we construct and annotate a dataset of 804 dialogues. This dataset is made available free of charge for research purposes. We show that women are notably underrepresented in subjects such as sports, politics and conflicts. Conversely, on topics such as weather, commercials and health, women have more speaking time than their overall average across all subjects. We also observe representation differences between private and public service channels.

category, dataset, dialogue, (14 more...)

arXiv.org Artificial Intelligence

2407.1418

Country:

North America > Dominican Republic (0.04)
Europe > Ireland (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry: Media > News (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.85)

Automatic Classification of Bug Reports Based on Multiple Text Information and Reports' Intention

Meng, Fanqi, Wang, Xuesong, Wang, Jingdong, Wang, Peifang

arXiv.org Artificial Intelligence, Aug-2-2022

With the rapid growth of software scale and complexity, a large number of bug reports are submitted to the bug tracking system. In order to speed up defect repair, these reports need to be accurately classified so that they can be sent to the appropriate developers. However, the existing classification methods only use the text information of the bug report, which leads to their low performance. To solve this problem, this paper proposes a new automatic classification method for bug reports. The innovation is that when categorizing bug reports, in addition to using the text information of the report, the intention of the report (i.e. suggestion or explanation) is also considered, thereby improving the performance of the classification. First, we collect bug reports from four ecosystems (Apache, Eclipse, Gentoo, Mozilla) and manually annotate them to construct an experimental data set. Then, we use Natural Language Processing technology to preprocess the data. On this basis, BERT and TF-IDF are used to extract the features of the intention and the multiple text information. Finally, the features are used to train the classifiers. The experimental results on five classifiers (K-Nearest Neighbor, Naive Bayes, Logistic Regression, Support Vector Machine, and Random Forest) show that our proposed method achieves better performance, with an F-Measure ranging from 87.3% to 95.5%.

artificial intelligence, bug report, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-031-10363-6_9

2208.01274

Country:

Europe > Switzerland (0.04)
Asia > China > Jilin Province (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
(2 more...)

Automatic Classification of Sexual Harassment Cases

#artificialintelligence, Oct-2-2019, 14:54:58 GMT

In our case, the data was provided by Safecity India, a platform launched in 2012 that crowdsources personal stories of sexual harassment and abuse in public spaces [2]. They have collected over 10,000 stories from over 50 cities in India, Kenya, Cameroon, and Nepal. More specifically, they provided us a .cvs … In addition to the focal tasks of this project, and as part of the NLP channel, we decided to automate the category classification based on the sexual harassment case descriptions. Performing this classification task manually is time-consuming, and leaving it entirely in the hands of the victim could produce ambiguity in the discrimination of the categories.

category, classification, classifier, (11 more...)

#artificialintelligence

Country:

Asia > India (0.46)
Asia > Nepal (0.25)
Africa > Kenya (0.25)
Africa > Cameroon (0.25)

Industry:

Law > Criminal Law (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.95)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.31)

Unsupervised automatic classification of Scanning Electron Microscopy (SEM) images of CD4+ cells with varying extent of HIV virion infection

Wandeto, John M., Dresp-Langley, Birgitta

arXiv.org Artificial Intelligence, Apr-30-2019

Archiving large sets of medical or cell images in digital libraries may require ordering randomly scattered sets of image data according to specific criteria, such as the spatial extent of a specific local color or contrast content that reveals different meaningful states of a physiological structure, tissue, or cell in a certain order, indicating progression or recession of a pathology, or the progressive response of a cell structure to treatment. Here we used a Self Organized Map (SOM)-based, fully automatic and unsupervised classification procedure described in our earlier work and applied it to sets of minimally processed grayscale and/or color processed Scanning Electron Microscopy (SEM) images of CD4+ T-lymphocytes (so-called helper cells) with varying extent of HIV virion infection. It is shown that the quantization error in the SOM output after training makes it possible to scale the spatial magnitude and the direction of change (+ or -) in local pixel contrast or color across images of a series with a reliability that exceeds that of any human expert. The procedure is easily implemented and fast, and represents a promising step towards low-cost automatic digital image archiving with minimal intervention of a human operator.

classification, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

1905.037

Country:

Europe > France > Grand Est > Bas-Rhin > Strasbourg (0.07)
Europe > Italy > Friuli Venezia Giulia > Trieste Province > Trieste (0.05)
Europe > Germany > Berlin (0.05)
Africa > Kenya > Nyeri County > Nyeri (0.05)

Genre: Research Report (0.40)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology > HIV (0.64)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.43)

Automatic Classification of Pathology Reports using TF-IDF Features

Kalra, Shivam, Li, Larry, Tizhoosh, Hamid R.

arXiv.org Machine Learning, Mar-5-2019

A pathology report is arguably one of the most important documents in medicine, containing interpretive information about the visual findings from the patient's biopsy sample. Each pathology report has a retention period of up to 20 years after the treatment of a patient. Cancer registries process and encode high volumes of free-text pathology reports for surveillance of cancer and tumor diseases all across the world. Despite the extremely valuable information they hold, pathology reports are not used in any systematic way to facilitate computational pathology. Therefore, in this study, we investigate automated machine-learning techniques to identify/predict the primary diagnosis (based on ICD-O code) from pathology reports. We performed experiments by extracting the TF-IDF features from the reports and classifying them using three different methods: SVM, XGBoost, and Logistic Regression. We constructed a new dataset with 1,949 pathology reports arranged into 37 ICD-O categories, collected from four different primary sites, namely lung, kidney, thymus, and testis. The reports were manually transcribed into text format after collecting them as PDF files from NCI Genomic Data Commons public dataset. We subsequently pre-processed the reports by removing irrelevant textual artifacts produced by OCR software. The highest classification accuracy we achieved was 92% using the XGBoost classifier on TF-IDF feature vectors; the linear SVM scored 87% accuracy. Furthermore, the study shows that TF-IDF vectors are suitable for highlighting the important keywords within a report, which can be helpful for the cancer research and diagnostic workflow. The results are encouraging in demonstrating the potential of machine learning methods for classification and encoding of pathology reports.

artificial intelligence, machine learning, pathology report, (16 more...)

arXiv.org Machine Learning

1903.07406

Country:

North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > New Finding (0.55)
Research Report > Experimental Study (0.34)

Industry:

Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area > Oncology > Carcinoma (0.97)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.49)

Automatic classification of trees using a UAV onboard camera and deep learning

Onishi, Masanori, Ise, Takeshi

arXiv.org Machine Learning, Apr-27-2018

Automatic classification of trees using remotely sensed data has been a dream of many scientists and land use managers. Recently, unmanned aerial vehicles (UAVs) have been expected to be an easy-to-use, cost-effective tool for remote sensing of forests, and deep learning has attracted attention for its machine vision capabilities. In this study, using a commercially available UAV and a publicly available package for deep learning, we constructed a machine vision system for the automatic classification of trees. In our method, we segmented a UAV photograph of a forest into individual tree crowns and carried out object-based deep learning. As a result, the system was able to classify 7 tree types at 89.0% accuracy. This performance is notable because we only used basic RGB images from a standard UAV. In contrast, most previous studies used expensive hardware such as multispectral imagers to improve the performance. This result means that our method has the potential to classify individual trees in a cost-effective manner. This can be a usable tool for many forest researchers and managers.

artificial intelligence, classification, machine learning, (15 more...)

arXiv.org Machine Learning

1804.1039

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.70)

Industry:

Media > Photography (0.49)
Energy (0.37)
Information Technology (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
